Goto

Collaborating Authors

 multiple object tracking



Is Multiple Object Tracking a Matter of Specialization?

Neural Information Processing Systems

End-to-end transformer-based trackers have achieved remarkable performance on most human-related datasets. However, training these trackers in heterogeneous scenarios poses significant challenges, including negative interference - where the model learns conflicting scene-specific parameters - and limited domain generalization, which often necessitates expensive fine-tuning to adapt the models to new domains. In response to these challenges, we introduce Parameter-efficient Scenario-specific Tracking Architecture (PASTA), a novel framework that combines Parameter-Efficient Fine-Tuning (PEFT) and Modular Deep Learning (MDL). Specifically, we define key scenario attributes (e.g, camera-viewpoint, lighting condition) and train specialized PEFT modules for each attribute. These expert modules are combined in parameter space, enabling systematic generalization to new domains without increasing inference time.


Is Multiple Object Tracking a Matter of Specialization?

Mancusi, Gianluca, Bernardi, Mattia, Panariello, Aniello, Porrello, Angelo, Cucchiara, Rita, Calderara, Simone

arXiv.org Artificial Intelligence

End-to-end transformer-based trackers have achieved remarkable performance on most human-related datasets. However, training these trackers in heterogeneous scenarios poses significant challenges, including negative interference - where the model learns conflicting scene-specific parameters - and limited domain generalization, which often necessitates expensive fine-tuning to adapt the models to new domains. In response to these challenges, we introduce Parameter-efficient Scenario-specific Tracking Architecture (PASTA), a novel framework that combines Parameter-Efficient Fine-Tuning (PEFT) and Modular Deep Learning (MDL). Specifically, we define key scenario attributes (e.g., camera-viewpoint, lighting condition) and train specialized PEFT modules for each attribute. These expert modules are combined in parameter space, enabling systematic generalization to new domains without increasing inference time. Extensive experiments on MOT-Synth, along with zero-shot evaluations on MOT17 and PersonPath22 demonstrate that a neural tracker built from carefully selected modules surpasses its monolithic counterpart. We release models and code.


MAML MOT: Multiple Object Tracking based on Meta-Learning

Chen, Jiayi, Deng, Chunhua

arXiv.org Artificial Intelligence

With the advancement of video analysis technology, the multi-object tracking (MOT) problem in complex scenes involving pedestrians is gaining increasing importance. This challenge primarily involves two key tasks: pedestrian detection and re-identification. While significant progress has been achieved in pedestrian detection tasks in recent years, enhancing the effectiveness of re-identification tasks remains a persistent challenge. This difficulty arises from the large total number of pedestrian samples in multi-object tracking datasets and the scarcity of individual instance samples. Motivated by recent rapid advancements in meta-learning techniques, we introduce MAML MOT, a meta-learning-based training approach for multi-object tracking. This approach leverages the rapid learning capability of meta-learning to tackle the issue of sample scarcity in pedestrian re-identification tasks, aiming to improve the model's generalization performance and robustness. Experimental results demonstrate that the proposed method achieves high accuracy on mainstream datasets in the MOT Challenge. This offers new perspectives and solutions for research in the field of pedestrian multi-object tracking.


Understanding Multiple Object Tracking using DeepSORT

#artificialintelligence

Surveillance cameras plays an essential role in securing our home or business. These cameras are super affordable. So is setting up a surveillance system. The only difficult and expensive part is the monitoring. For real time monitoring, usually a security personnel or a team has to be assigned. It is simply not feasible for all.


GAKP: GRU Association and Kalman Prediction for Multiple Object Tracking

Li, Zhen, Cai, Sunzeng, Wang, Xiaoyi, Liu, Zhe, Xue, Nian

arXiv.org Artificial Intelligence

Multiple Object Tracking (MOT) has been a useful yet challenging task in many real-world applications such as video surveillance, intelligent retail, and smart city. The challenge is how to model long-term temporal dependencies in an efficient manner. Some recent works employ Recurrent Neural Networks (RNN) to obtain good performance, which, however, requires a large amount of training data. In this paper, we proposed a novel tracking method that integrates the auto-tuning Kalman method for prediction and the Gated Recurrent Unit (GRU) and achieves a near-optimum with a small amount of training data. Experimental results show that our new algorithm can achieve competitive performance on the challenging MOT benchmark, and faster and more robust than the state-of-the-art RNN-based online MOT algorithms.


GMOT-40: A Benchmark for Generic Multiple Object Tracking

Bai, Hexin, Cheng, Wensheng, Chu, Peng, Liu, Juehuan, Zhang, Kai, Ling, Haibin

arXiv.org Artificial Intelligence

Multiple Object Tracking (MOT) has witnessed remarkable advances in recent years. However, existing studies dominantly request prior knowledge of the tracking target, and hence may not generalize well to unseen categories. In contrast, Generic Multiple Object Tracking (GMOT), which requires little prior information about the target, is largely under-explored. In this paper, we make contributions to boost the study of GMOT in three aspects. First, we construct the first public GMOT dataset, dubbed GMOT-40, which contains 40 carefully annotated sequences evenly distributed among 10 object categories. In addition, two tracking protocols are adopted to evaluate different characteristics of tracking algorithms. Second, by noting the lack of devoted tracking algorithms, we have designed a series of baseline GMOT algorithms. Third, we perform a thorough evaluation on GMOT-40, involving popular MOT algorithms (with necessary modifications) and the proposed baselines. We will release the GMOT-40 benchmark, the evaluation results, as well as the baseline algorithm to the public upon the publication of the paper.


Graph Neural Networks for Multiple Object Tracking

#artificialintelligence

Multiple object tracking(MOT) is the task of studying object appearance and movements to analyze their trajectories. For a given input video the algorithm is supposed to output which portions of the image represent the same object in different frames of the video. Algorithms like these can be used to solve some exciting problems like analyzing a particular soccer player's movements during the game, predicting whether a person is going to cross the street or not, or to track and analyze the movement of microscopic organisms in time-lapse microscopy images, etc. In this article, we will go through a state of the art Offline tracking framework for solving the problem of MOT. The approach that we are about to discuss was published in a paper by the researchers at the Dynamic Vision and Learning Group at TUM. Their proposed algorithm achieved SOTA on MOT15, MOT16, and MOT17 challenges.


Inference for multiple object tracking: A Bayesian nonparametric approach

Moraffah, Bahman

arXiv.org Machine Learning

In recent years, multi object tracking (MOT) problem has drawn attention to it and has been studied in various research areas. However, some of the challenging problems including time dependent cardinality, unordered measurement set, and object labeling remain unclear. In this paper, we propose robust nonparametric methods to model the state prior for MOT problem. These models are shown to be more flexible and robust compared to existing methods. In particular, the overall approach estimates time dependent object cardinality, provides object labeling, and identifies object associated measurements. Moreover, our proposed framework dynamically contends with the birth/death and survival of the objects through dependent nonparametric processes. We present Inference algorithms that demonstrate the utility of the dependent nonparametric models for tracking. We employ Monte Carlo sampling methods to demonstrate the proposed algorithms efficiently learn the trajectory of objects from noisy measurements. The computational results display the performance of the proposed algorithms and comparison not only between one another, but also between proposed algorithms and labeled multi Bernoulli tracker.